Search CORE

136 research outputs found

Bot recognition in a Web store: An approach based on unsupervised learning

Author: Francesco Masulli
Grażyna Suchacka
Stefano Rovetta
Publication date: 01/01/2020

Web traffic on e-business sites is increasingly dominated by artificial agents (Web bots), which pose a threat to website security, privacy, and performance. To develop efficient bot detection methods and discover reliable e-customer behavioural patterns, the accurate separation of traffic generated by legitimate users and Web bots is necessary. This paper proposes a machine learning solution to the problem of bot and human session classification, with a specific application to e-commerce. The approach studied in this work explores the use of unsupervised learning (k-means and Graded Possibilistic c-Means), followed by supervised labelling of clusters, a generative learning strategy that decouples modelling the data from labelling them. Its efficiency is evaluated through experiments on real e-commerce data, in realistic conditions, and compared to that of supervised learning classifiers (a multi-layer perceptron neural network and a support vector machine). Results demonstrate that the classification based on unsupervised learning is very efficient, achieving a performance level similar to that of fully supervised classification. This is an experimental indication that the bot recognition problem can be successfully dealt with using methods that are less sensitive to mislabelled data or missing labels. A very small fraction of sessions remain misclassified in both cases, so an in-depth analysis of misclassified samples was also performed. This analysis exposed the superiority of the proposed approach, which correctly recognized more bots and identified more camouflaged agents that had been erroneously labelled as humans.

Archivio istituzionale della ricerca - Università di Genova

Open Access Repository

An automatic method for the lexical disambiguation of names

Author: Buscaldi Davide
Masulli Francesco
Rosso Paolo
Publication venue: Universidad Autónoma de Bucaramanga UNAB
Publication date: 01/06/2003

This article presents a completely automatic method that solves the lexical disambiguation of names by calculating the conceptual density of each of the senses of the name to be disambiguated. The method was evaluated on the SemCor corpus with a context of only two names, obtaining a precision of 81.5% and a recall of 60.25%. Keywords: lexical disambiguation of names, conceptual density.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Un método automático para la desambiguación léxica de nombres

Author: Buscaldi Davide
Masulli Francesco
Rosso Paolo
Publication venue: Universidad Autónoma de Bucaramanga
Publication date: 01/06/2003

Directory of Open Access Journals

UNAB Revistas Académicas

Soft ranking in clustering

Author: Francesco Masulli
Maurizio Filippone
Stefano Rovetta
Publication venue: Elsevier BV
Publication date: 01/01/2009

Due to the diffusion of large-dimensional data sets (e.g., in DNA microarray or document organization and retrieval applications), there is a growing interest in clustering methods based on a proximity matrix. These have the advantage of being based on a data structure whose size only depends on cardinality, not dimensionality. In this paper, we propose a clustering technique based on fuzzy ranks. The use of ranks helps to overcome several issues of large-dimensional data sets, whereas the fuzzy formulation is useful in encoding the information contained in the smallest entries of the proximity matrix. Comparative experiments are presented, using several standard hierarchical clustering techniques as a reference

Crossref

Archivio istituzionale della ricerca - Università di Genova

White Rose Research Online

Dietary potassium intake and risk of diabetes: a systematic review and meta-analysis of prospective studies

Author: Cappuccio Francesco P.
D’Elia Lanfranco
Galletti Ferruccio
Masulli Maria
Strazzullo Pasquale
Zarrella Aquilino F.
Publication venue: MDPI AG
Publication date: 01/11/2022

(1) Background: Dietary potassium intake is positively associated with reduction of cardiovascular risk. Several data are available on the relationship between dietary potassium intake, diabetes risk and glucose metabolism, but with inconsistent results. Therefore, we performed a meta-analysis of the prospective studies that explored the effect of dietary potassium intake on the risk of diabetes to overcome these limitations. (2) Methods: A random-effects dose–response meta-analysis was carried out for prospective studies. A potential non-linear relation was investigated using restricted cubic splines. (3) Results: A total of seven prospective studies met the inclusion criteria. Dose–response analysis detected a non-linear relationship between dietary potassium intake and diabetes risk, with significant inverse association starting from 2900 mg/day by questionnaire and between 2000 and 5000 mg/day by urinary excretion. There was high heterogeneity among studies, but no evidence of publication bias was found. (4) Conclusions: The results of this meta-analysis indicate that habitual dietary potassium consumption is associated with risk of diabetes by a non-linear dose–response relationship. The beneficial threshold found supports the campaigns in favour of an increase in dietary potassium intake to reduce the risk of morbidity and mortality. Further studies should be carried out to explore this topic

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository

A survey of kernel and spectral methods for clustering

Author: Francesco Camastra
Francesco Masulli
Maurizio Filippone
Stefano Rovetta
Publication venue: Elsevier BV
Publication date: 01/01/2007

Clustering algorithms are a useful tool to explore data structures and have been employed in many disciplines. The focus of this paper is the partitioning clustering problem with a special interest in two recent approaches: kernel and spectral methods. The aim of this paper is to present a survey of kernel and spectral clustering methods, two approaches able to produce nonlinear separating hypersurfaces between clusters. The presented kernel clustering methods are the kernel versions of many classical clustering algorithms, e.g., K-means, SOM and neural gas. Spectral clustering arises from concepts in spectral graph theory, and the clustering problem is configured as a graph cut problem where an appropriate objective function has to be optimized. An explicit proof that these two paradigms have the same objective is reported, since it has been proven that these two seemingly different approaches have the same mathematical foundation. Besides, fuzzy kernel clustering methods are presented as extensions of the kernel K-means clustering algorithm. © 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.

CiteSeerX

Archivio della ricerca - Università degli studi di Napoli "Parthenope"

Crossref

Enlighten

Archivio istituzionale della ricerca - Università di Genova

White Rose Research Online

Role of continuous glucose monitoring in diabetic patients at high cardiovascular risk: an expert-based multidisciplinary Delphi consensus

Background: Continuous glucose monitoring (CGM) shows in more detail the glycaemic pattern of diabetic subjects and provides several new parameters (“glucometrics”) to assess patients’ glycaemia and consensually guide treatment. A better control of glucose levels might result in improvement of clinical outcome and reduce disease complications. This study aimed to gather an expert consensus on the clinical and prognostic use of CGM in diabetic patients at high cardiovascular risk or with heart disease. Methods: A list of 22 statements was developed concerning the type of patients who can benefit from CGM, the prognostic impact of CGM in diabetic patients with heart disease, CGM use during acute cardiovascular events, and educational issues of CGM. Using a two-round Delphi methodology, the survey was distributed online to 42 Italian experts (21 diabetologists and 21 cardiologists) who rated their level of agreement with each statement on a 5-point Likert scale. Consensus was predefined as more than 66% of the panel agreeing/disagreeing with any given statement. Results: Forty experts (95%) answered the survey. Every statement achieved a positive consensus. In particular, the panel expressed the feeling that CGM can be prognostically relevant for every diabetic patient (70%) and that it is also clinically useful in the management of those with type 2 diabetes not treated with insulin (87.5%). The assessment of time in range (TIR), glycaemic variability (GV) and hypoglycaemic/hyperglycaemic episodes was considered relevant in the management of diabetic patients with heart disease (92.5% for TIR, 95% for GV, 97.5% for time spent in hypoglycaemia) and able to improve the prognosis of those with ischaemic heart disease (100% for hypoglycaemia, 90% for hyperglycaemia) or with heart failure (87.5% for hypoglycaemia, 85% for TIR, 87.5% for GV). The experts maintained that CGM can be used during an acute cardiovascular event and can impact the short- and long-term prognosis. Lastly, CGM has a recognized educational role for diabetic subjects. Conclusions: According to this Delphi consensus, the clinical and prognostic use of CGM in diabetic patients at high cardiovascular risk is promising and deserves dedicated studies to confirm the experts’ feeling.

Archivio della ricerca- Università di Roma La Sapienza

Tracking Time-Evolving Data Streams and an Application to Short-Term Urban Traffic Flow Forecasting

Author: Masulli Francesco
Publication venue: Noida
Publication date: 01/01/2016

Crossref

Archivio istituzionale della ricerca - Università di Genova

Introduction to Bioinformatics Data Sets Mining Using Fuzzy Biclustering

Author: Masulli Francesco
Rovetta Stefano
Publication venue: Piscataway, NJ, USA
Publication date: 01/01/2009

Archivio istituzionale della ricerca - Università di Genova